A new clustering evaluation function using Renyi's information potential

Authors

Erhan Gokcay

José Carlos Príncipe

Abstract

Clustering is an important unsupervised learning paradigm, but traditional methods are mostly based on minimizing the variance between the data and the cluster means. Here we propose a new evaluation function based on a recently developed information-theoretic measure defined from Renyi's entropy. We show how to apply Renyi's entropy to clustering and analyze the staircase nature of the performance function that can be expected during learning. We suggest simulated annealing as a possible optimization method.
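The abstract does not state the formulas, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used Parzen-window estimator of Renyi's quadratic entropy with an isotropic Gaussian kernel, where the "information potential" is the mean pairwise kernel value and the entropy is its negative logarithm. The function names, the kernel width `sigma`, and the two-cluster `cross_potential` score are hypothetical illustrations and may differ from the evaluation function the authors actually define.

```python
import numpy as np


def pairwise_kernel(a, b, sigma):
    """Gaussian kernel values for every pair (a_i, b_j).

    Assumption: each sample carries an isotropic Gaussian Parzen window of
    variance sigma^2, so the pairwise interaction has variance 2 * sigma^2.
    """
    dim = a.shape[1]
    var = 2.0 * sigma ** 2
    diff = a[:, None, :] - b[None, :, :]          # shape (Na, Nb, dim)
    sq_dist = np.sum(diff ** 2, axis=-1)          # shape (Na, Nb)
    norm = (2.0 * np.pi * var) ** (dim / 2.0)
    return np.exp(-sq_dist / (2.0 * var)) / norm


def information_potential(x, sigma):
    """V(X) = (1 / N^2) * sum_i sum_j G(x_i - x_j, 2 * sigma^2)."""
    return pairwise_kernel(x, x, sigma).mean()


def renyi_quadratic_entropy(x, sigma):
    """Parzen estimate of Renyi's quadratic entropy: H2 = -log V(X)."""
    return -np.log(information_potential(x, sigma))


def cross_potential(x, labels, sigma):
    """Hypothetical two-cluster score: mean kernel value over pairs whose
    members lie in different clusters. Lower means better separation."""
    a, b = x[labels == 0], x[labels == 1]
    return pairwise_kernel(a, b, sigma).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blobs = np.vstack([rng.normal(0.0, 1.0, (30, 2)),
                       rng.normal(10.0, 1.0, (30, 2))])
    by_blob = np.repeat([0, 1], 30)       # labels follow the true blobs
    alternating = np.arange(60) % 2       # labels mix the two blobs
    print("score, labels by blob:    ", cross_potential(blobs, by_blob, 1.0))
    print("score, alternating labels:", cross_potential(blobs, alternating, 1.0))
```

In this sketch a labeling that keeps each blob intact yields a much smaller between-cluster score than one that splits the blobs. Because the score changes only when a sample switches cluster, it moves in discrete steps over the space of labelings, which is one plausible reading of why a discrete search such as simulated annealing is suggested.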

Similar resources

Design a Hybrid Recommender System Solving Cold-start Problem Using Clustering and Chaotic PSO Algorithm

One of the main challenges of the growing volume of information in the new era is finding information of interest in the mass of data. This issue has been considered in the design of many sites that interact with users. Recommender systems have been developed to address it and to help users reach the information they want; however, they face limitations. One of the mos...

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...

Semi-supervised composite kernel learning using distance metric learning techniques

The distance metric plays a key role in many machine learning and computer vision algorithms, so choosing an appropriate distance metric directly affects the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...

A New Method for Duplicate Detection Using Hierarchical Clustering of Records

Accuracy and validity of data are prerequisites for the proper operation of any software system. Errors in data are always possible because of human and system faults. One such error is the existence of duplicate records in data sources. Duplicate records refer to the same real-world entity. There must be one of them in a data source, but for some reasons like aggregation of ...

Metaheuristic clustering of Persian XML documents based on structural and content similarity

Because the number of XML documents keeps increasing, organizing these documents effectively in order to retrieve useful information from them is essential. One possible solution is to cluster the XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...

Publication date: 2000
